
Conversation


@NimaSarajpoor NimaSarajpoor commented Aug 1, 2023

Fix #881

@NimaSarajpoor (Collaborator Author)

@seanlaw
After merging #893, I will update this branch and run the tests again.


codecov bot commented Aug 1, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (900b10c) 98.93% compared to head (5dcc5c4) 98.93%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #892   +/-   ##
=======================================
  Coverage   98.93%   98.93%           
=======================================
  Files          84       84           
  Lines       14236    14296   +60     
=======================================
+ Hits        14084    14144   +60     
  Misses        152      152           
Files Changed             Coverage Δ
stumpy/core.py            100.00% <100.00%> (ø)
stumpy/gpu_stump.py       100.00% <100.00%> (ø)
tests/test_precision.py   100.00% <100.00%> (ø)


@NimaSarajpoor NimaSarajpoor (Collaborator Author) left a comment

@seanlaw
Do you think we can come up with a more elegant solution?

npt.assert_almost_equal(D_ref, D_comp)


def test_snippets():
@NimaSarajpoor NimaSarajpoor (Collaborator Author) left a comment

Add a comment to show the purpose of this particular test function.

For example, we may write:

# This test function raises an error due to a small, accumulated loss of precision in `snippet_profiles`.
# To avoid that, the code needs to be revised to reduce the loss of precision.


seanlaw commented Aug 2, 2023

@seanlaw Do you think we can come up with a more elegant solution?

It feels like a hack and has a code smell. In this instance, how do we know which direction (AB vs BA) is suffering from a loss of precision?

@NimaSarajpoor (Collaborator Author)

It feels like a hack and has a code smell.

I agree!

In this instance, how do we know which direction (AB vs BA) is suffering from a loss of precision?

Need to dig deeper :) Will provide an update.


NimaSarajpoor commented Aug 3, 2023

[update]

how do we know which direction (AB vs BA) is suffering from a loss of precision?

For the case added to the test function, the answer is both! However, the loss of precision in one direction changed the ref value by +1e-14, and in the other by -1e-14. Since the two deltas are not identical, the final outcome changed!


I followed the code all the way down and noticed that this particular loss of precision comes from the function
core._calculate_squared_distance, in which the distance between T[i : i + s] and T[j : j + s] is not the same as the distance between T[j : j + s] and T[i : i + s], where i == 2, j == 10.

Note 1: I set fastmath to False, but that did not help.
Note 2: I tried something else locally. Going to push it here and see how it goes.

@NimaSarajpoor (Collaborator Author)

@seanlaw
I think I got to the bottom of it!

# A failure case
import numpy as np
import stumpy

m = 10
k = 3
s = 3

seed = 332
np.random.seed(seed)
T = np.random.uniform(-1000.0, 1000.0, [64])

M_T, Σ_T = stumpy.core.compute_mean_std(T, s)

i, j = 2, 10

# denom for computing rho (see func `core._calculate_squared_distance` in the branch `main`)
denom_v1 = s * Σ_T[i] * Σ_T[j]
denom_v2 = s * Σ_T[j] * Σ_T[i]
print(denom_v1, denom_v2)
# denom_v1 and denom_v2 have a very small difference

# denom for computing m * rho (see the last commit in this PR)
denom_v1 = Σ_T[i] * Σ_T[j]
denom_v2 = Σ_T[j] * Σ_T[i]
print(denom_v1, denom_v2)
# denom_v1 and denom_v2 are exactly the same


NimaSarajpoor commented Aug 3, 2023

@seanlaw
[update]

As a follow-up to my previous comment:

Alternative to the last commit:
I think if we replace

https://github.com/TDAmeritrade/stumpy/blob/6c367accc523708f6e4ec6701da76447cc5b1b0b/stumpy/core.py#L1100

with
denom = σ_Q * Σ_T * m

The problem should be solved since, in this case, swapping the values σ_Q and Σ_T does not change the value of denom. (I do not know why though 😄 Something is happening behind the scenes for sure!)


Additional note 1: I also tried float(m), but that did not change anything!
Additional note 2: Adding parentheses also did not change anything!
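
A minimal sketch (my addition, not from the thread) of why swapping σ_Q and Σ_T cannot change `σ_Q * Σ_T * m`: IEEE-754 float multiplication is commutative bit for bit, but it is not associative, so only the grouping of the three-way product matters. The values below are illustrative.

```python
# Python evaluates `m * x * y` as `(m * x) * y`, so changing the operand
# order changes the intermediate rounding; commuting two operands within
# the same grouping never does.
x = 650.912209452633   # illustrative values
y = 722.0717285148526
m = 3

assert x * y == y * x                # commutativity is exact in IEEE-754
assert (x * y) * m == (y * x) * m    # same grouping, commuted operands: identical
print((m * x) * y == (m * y) * x)    # may be False: different intermediate roundings
```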


seanlaw commented Aug 3, 2023

@NimaSarajpoor Can you tell me a set of values for σ_Q, Σ_T, and m that would cause denom to be different if we changed the order? I'd like to post this in the numba forum

With those same values, are you able to check if gpu_stump might also suffer from the same issue:
https://github.com/TDAmeritrade/stumpy/blob/6c367accc523708f6e4ec6701da76447cc5b1b0b/stumpy/gpu_stump.py#L180-L185


seanlaw commented Aug 3, 2023

I think I got to the bottom of it!

Can you please add this test case to test_precision.py?


NimaSarajpoor commented Aug 3, 2023

@seanlaw

I'd like to post this in the numba forum

I think this is related to just Python (I ran my simple experiment outside of numba!)

Can you tell me a set of values for σ_Q, Σ_T, and m that would cause denom to be different if we changed the order?

How about this:

import numpy as np

lst = []
for seed in range(1000):
    np.random.seed(seed)
    vals = np.random.uniform(-1000, 1000, size=2)
    x = vals[0]
    y = vals[1]
    m = 3

    A = m * x * y
    B = m * y * x

    if A != B:
        lst.append(seed)

Then, run the same thing but change the order of multiplication. Check lst in both cases.
(Spoiler alert: the lst stays empty if you change the order of multiplication.)

In case you just need one example: 3 * 650.912209452633 * 722.0717285148526

With those same values, are you able to check if gpu_stump might also suffer from the same issue:

Do you think I should still check this given that the issue is coming from python?

Can you please add this test case to test_precision.py?

Will do it 🙂


seanlaw commented Aug 3, 2023

Thanks @NimaSarajpoor. It looks like this would make things more precise:

import decimal

A = decimal.Decimal(3) * decimal.Decimal(650.912209452633) * decimal.Decimal(722.0717285148526)
B = decimal.Decimal(3) * decimal.Decimal(722.0717285148526) * decimal.Decimal(650.912209452633)

However, note that you'll have to cast it back to float(A) and float(B) before that value can be used to combine with other float values

What if we did:

denom = float(decimal.Decimal(m) * decimal.Decimal(σ_Q) * decimal.Decimal(Σ_T))

I wonder if this would affect the performance since it is a deeply nested function that gets called a lot. I am reading some stories that decimal can increase the time 50-100x. Can you please check by using a larger T input?
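
A runnable sketch (my addition) of the `Decimal` idea above; the values standing in for `m`, `σ_Q`, and `Σ_T` are illustrative:

```python
import decimal

# Illustrative stand-ins for m, σ_Q, Σ_T
m, σ_Q, Σ_T = 3, 650.912209452633, 722.0717285148526

# Decimal(float) is exact, and the default 28-digit context rounds far below
# float precision, so after converting back to float the product no longer
# depends on the order of the operands.
denom_ab = float(decimal.Decimal(m) * decimal.Decimal(σ_Q) * decimal.Decimal(Σ_T))
denom_ba = float(decimal.Decimal(m) * decimal.Decimal(Σ_T) * decimal.Decimal(σ_Q))
print(denom_ab == denom_ba)
```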

@NimaSarajpoor (Collaborator Author)

Just to make sure we are on the same page: are we talking about comparing the performance of stump? (And maybe gpu_stump too?)


NimaSarajpoor commented Aug 3, 2023

(New lesson for me: avoid hacks as much as possible! Now that I think about it more, and after reading your "Decimal" approach, I can see that swapping the variables is still a hack! These hacks should be our last resort! Thank you for the lesson!!)


seanlaw commented Aug 3, 2023

Just to make sure we are on the same page: are we talking about comparing the performance of stump? (And maybe gpu_stump too?)

So, since the problem is really in _calculate_squared_distance, I would call the parent function _calculate_squared_distance_profile and pass in sets of QT, μ_Q, σ_Q, M_T, Σ_T of varying lengths (which would then execute _calculate_squared_distance repeatedly) to see if using decimal has any aggregate/cumulative effect on the computation time.

Does that make sense? Please make sure to turn numba back on (in case numba ignores decimal). If the added time is negligible then I think we have a pretty good/robust solution that balances accuracy with computational time.

(New lesson for me: avoid hacks as much as possible! Now that I think about it more, and after reading your "Decimal" approach, I can see that swapping the variables is still a hack! These hacks should be our last resort! Thank you for the lesson!!)

I'm glad that you understood. It is certainly easier to hack together a solution but it is another thing to:

  1. Isolate the problem (what you did) and then
  2. Understand what causes the problem (i.e., general issues with floating-point representation)
  3. Identify the best/typical options that others have used to address it cleanly (what I did)
  4. Try those solutions

@NimaSarajpoor (Collaborator Author)

@seanlaw
I just noticed that we have another similar case when computing the distance. Kindly see the line below:

https://github.com/TDAmeritrade/stumpy/blob/6c367accc523708f6e4ec6701da76447cc5b1b0b/stumpy/core.py#L1103

m * μ_Q * M_T is a similar case where changing the order can change the precision. For now, I made a small change to address this and the one in denom. I know that it is a hack, but it is a cleaner one compared to the previous hacks. I will consider it as our last resort.

I will go and apply your proposal to the code for both m * μ_Q * M_T and the denom m * σ_Q * Σ_T. I will provide an update.


seanlaw commented Aug 4, 2023

I will go and apply your proposal to the code for both m * μ_Q * M_T and the denom m * σ_Q * Σ_T. I will provide an update.

Sounds good!

@NimaSarajpoor (Collaborator Author)

[update]

Apparently, numba does not support decimal types. See: this GitHub issue and this post in numba Discourse.


[just for our future reference if needed]
I also compared the running time of the original version with this PR. They are the same.


# code
import time

import numpy as np
import stumpy
from stumpy.core import _calculate_squared_distance_profile

n_values = 100000 * np.arange(11)
n_values[0] = 5000

print('n_values:', n_values)

m = 100

running_time = []
for i in range(len(n_values)):
    np.random.seed(i)
    n = n_values[i]
    T = np.random.uniform(-1000, 1000, size=n)
    j = np.random.randint(n - m + 1)
    Q = T[j : j + m]

    QT = stumpy.core._sliding_dot_product(Q, T)

    Q_subseq_isconstant = False
    T_subseq_isconstant = np.zeros(n - m + 1, dtype=bool)

    M_T, Σ_T = stumpy.core.compute_mean_std(T, m)

    lst = []
    for _ in range(5):
        ts = time.time()
        _calculate_squared_distance_profile(
            m,
            QT,
            M_T[j],
            Σ_T[j],
            M_T,
            Σ_T,
            Q_subseq_isconstant,
            T_subseq_isconstant,
        )
        te = time.time()
        lst.append(te - ts)

    running_time.append(np.mean(lst))


seanlaw commented Aug 4, 2023

Thank you. I find it rather surprising that adding parentheses did not solve the problem.

Also, can you please add the precision unit test above for _calculate_squared_distance?

@NimaSarajpoor (Collaborator Author)

Thank you. I find it rather surprising that adding parentheses did not solve the problem.

Yes, it is surprising indeed!


Btw, I tried np.float128 as follows:

denom = m * np.float128(σ_Q * Σ_T) 

In an isolated test, it seems that it can solve the precision issue. But when I use it in an njit function, I get a numba error:

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
float128
During: typing of get attribute at /tmp/ipykernel_13752/2182441483.py (58)
During: resolving callee type: type(CPUDispatcher(<function _calculate_squared_distance at 0x7fc6a21fc8b0>))
During: typing of call at /tmp/ipykernel_13752/2182441483.py (121)

@NimaSarajpoor (Collaborator Author)

Also, can you please add the precision unit test above for _calculate_squared_distance?

will do.


seanlaw commented Aug 4, 2023

I find it rather surprising that adding parentheses did not solve the problem.

So, when I do:

print(3 * 650.912209452633 * 722.0717285148526)
print(3 * 722.0717285148526 * 650.912209452633)

then the output is different:

1410015.9125726535
1410015.9125726533

However, if I add parentheses:

print(3 * (650.912209452633 * 722.0717285148526))
print(3 * (722.0717285148526 * 650.912209452633))

the issue appears to be resolved:

1410015.9125726535
1410015.9125726535

Similarly, adding parentheses here:

import numpy as np

np.random.seed(0)
vals = np.random.uniform(-1000, 1000, size=2)
x = vals[0]
y = vals[1]
m = 3

A = m * (x * y)
B = m * (y * x)

Also produces identical results

print(A)
print(B)

126049.76376646347
126049.76376646347
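
As a cross-check (my addition, not from the thread), re-running the earlier 1000-seed scan with the parenthesized grouping should leave the mismatch list empty, since `x * y == y * x` holds exactly in IEEE-754:

```python
import numpy as np

mismatches = []
for seed in range(1000):
    np.random.seed(seed)
    x, y = np.random.uniform(-1000, 1000, size=2)
    # With the grouping fixed, commuting the two operands cannot change the result
    if 3 * (x * y) != 3 * (y * x):
        mismatches.append(seed)

print(len(mismatches))  # 0
```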


NimaSarajpoor commented Aug 5, 2023

@seanlaw
Do you like the new test function test_distance_symmetry_property_in_gpu [added to check the precision in GPU]? I tested it in Colab. The assertion fails when the gpu_stump of the main branch is used.

Adding parentheses [for multiplication in GPU] did not resolve the assertion failure. But, when I re-order the multiplication, it resolves the loss-of-precision issue. Can you please try it out on your end as well?


seanlaw commented Aug 5, 2023

Adding parentheses [for multiplication in GPU] did not resolve the assertion failure. But, when I re-order the multiplication, it resolves the loss-of-precision issue. Can you please try it out on your end as well?

Excellent. I think the GPU test looks great! Testing on Colab now

@seanlaw seanlaw (Contributor) left a comment

Can you please pull the latest commit into this branch? This way, test.sh is the most up-to-date for testing the GPU on Colab (I don't think test_precision is executed for unit testing in this repo when I clone it on Colab)

@NimaSarajpoor (Collaborator Author)

GPU on colab (I don't think test_precision is executed for unit testing in this repo when I clone it on colab)

Sure. [In the meantime, you may just use the custom flag to run the test for test_precision.py only]


seanlaw commented Aug 6, 2023

Excellent. I think the GPU test looks great! Testing on Colab now

test_precision.py is passing for me on colab. After the tests pass here (and you check the performance), I think we are good to merge, right @NimaSarajpoor?


NimaSarajpoor commented Aug 6, 2023

After the tests pass here (and you check the performance), I think we are good to merge, right @NimaSarajpoor?

Right..I think we should be good.


I noticed the assertion failure in the minimum version testing is for:

# os: linux
# python 3.7
# numpy 1.17
# note: no numba

import numpy as np
import numpy.testing as npt

seed = 332
np.random.seed(seed)
T = np.random.uniform(-1000.0, 1000.0, [64])
m = 3

i, j = 0, 51

ij_dot = np.dot(T[i:i+m], T[j:j+m])
ji_dot = np.dot(T[j:j+m], T[i:i+m])

npt.assert_almost_equal(ij_dot, ji_dot, decimal=15)

And, it fails. I tried changing the version of Python and/or numpy to find a version that preserves the symmetry property of the dot product, but I couldn't find one, as I still got a failure! [Which is strange, since when I check it in Colab with Python 3.10.12 and numpy 1.22.4, I get no error!] I will look into it.

Is it possible that a specific version of numpy works with a specific version of Python, considering their patches?
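
One backend-independent way to probe this (my suggestion, not from the thread): `np.dot` dispatches to the installed BLAS, which can vary across environments, while a plain-Python accumulation does not:

```python
import numpy as np

np.random.seed(332)
T = np.random.uniform(-1000.0, 1000.0, [64])
m = 3
i, j = 0, 51

a, b = T[i:i + m], T[j:j + m]

# Element-wise products commute exactly, and both sums accumulate in the same
# left-to-right order, so these two values are guaranteed to be bit-identical
# regardless of the environment, unlike np.dot.
py_ab = sum(float(u) * float(v) for u, v in zip(a, b))
py_ba = sum(float(v) * float(u) for u, v in zip(a, b))
print(py_ab == py_ba)  # True
```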


seanlaw commented Aug 6, 2023

Is it possible that a specific version of numpy works with a specific version of Python, considering their patches?

It's certainly possible. However, you're only calling np.dot, which I doubt has been changed in a long, long time and, my guess, it shouldn't have issues with backwards compatibility.

I think this is worth posting as an issue on the numba Github page (and maybe even StackOverflow)


seanlaw commented Aug 6, 2023

@NimaSarajpoor In case it might make our lives easier, I was thinking about bumping the minimum Python version since Python 3.7 reached end of life about a month ago. Of course, it'll require some work to figure out what other corresponding minimum numpy, numba, and scipy versions should accompany Python 3.8


NimaSarajpoor commented Aug 7, 2023

I think this is worth posting as an issue on the numba Github page

I created a new conda env with just Python and numpy (so no numba), and got the assertion failure.

I noticed the assertion failure in the minimum version testing is for:

# os: linux
# python 3.7
# numpy 1.17
# note: no numba

import numpy as np
import numpy.testing as npt

seed = 332
np.random.seed(seed)
T = np.random.uniform(-1000.0, 1000.0, [64])
m = 3

i, j = 0, 51

ij_dot = np.dot(T[i:i+m], T[j:j+m])
ji_dot = np.dot(T[j:j+m], T[i:i+m])

npt.assert_almost_equal(ij_dot, ji_dot, decimal=15)

The assertion will not fail if I do T=np.random.uniform(-1000.0, 1000.0, [64]).astype(np.float128)


I tried to do:

conda create -n py38 python=3.8
conda activate py38
conda install numpy

And I tried the case above and got the assertion failure. [It seems strange to me]


[not sure if it is related or not]
I noticed that I get no assertion failure in two of my conda envs. In both, I have the numpy-base package as well.


NimaSarajpoor commented Aug 7, 2023

In case it might make our lives easier, I was thinking about bumping the minimum Python version since Python 3.7 reached end of life about a month ago. Of course, it'll require some work to figure out what other corresponding minimum numpy, numba, and scipy versions should accompany Python 3.8

Dropping support for 3.7 may not be a bad idea, as I noticed several packages did the same. I am still curious to understand why setting up my conda env with Python 3.8 and installing the numpy package does not resolve the assertion failure issue related to np.dot

@NimaSarajpoor (Collaborator Author)

@seanlaw
Is there anything I can do? Should I start by changing the minimum version in requirements.txt and seeing if the tests pass? (And, of course, changing the Python versions in the GitHub Actions .yml file.)


seanlaw commented Aug 7, 2023

Is there anything I can do? Should I start by changing the minimum version in requirements.txt and seeing if the tests pass?

See issue #897


seanlaw commented Aug 11, 2023

@NimaSarajpoor What do you think? Is this ready to be merged?

@NimaSarajpoor (Collaborator Author)

I think it is ready 👍

@NimaSarajpoor (Collaborator Author)

Btw, one of the new test functions has no comment. Is that okay?


NimaSarajpoor commented Aug 11, 2023

[update]

Btw, one of the new test functions has no comment. Is that okay?

I took care of it.

Ready for review and merge [if nothing else needs to be done here]

@seanlaw seanlaw merged commit d4c6a21 into stumpy-dev:main Aug 11, 2023

seanlaw commented Aug 11, 2023

Thanks for working on this @NimaSarajpoor !

@NimaSarajpoor NimaSarajpoor deleted the fix_snippet branch September 4, 2023 03:24


Development

Successfully merging this pull request may close these issues.

Snippets Unit Test Assertion Failure
